pip metadata refactoring #680
LGTM with some minor nitpicks.
414e227
to
59e4f7e
Compare
59e4f7e
to
c66f55b
Compare
There's too much going on in this single commit, so it's difficult to follow all the changes in the diff. Please introduce them gradually.
Very much in favour of this work, needs some polishing though.
c66f55b
to
af59166
Compare
After having carefully gone through the unit tests, which I didn't do in my first round of reviews, I think we're actually opening ourselves up to potential issues with metadata mixed across pyproject.toml, setup.py, etc.
I think that while we may cosmetically change the code and break the logic into smaller helper functions, we'll still have to test the metadata querying in the compound way we're doing now.
That's a good point. The fact that we were mixing metadata from multiple project configuration files was the reason why we ended up with extremely complicated and long unit tests. Splitting name and version across multiple configuration files makes no sense on its own. In the end, we only need the
af59166
to
fa86d7b
Compare
More changes have accumulated, must take another look
Well, what this PR just did is a breaking change from the behaviour POV, without any warning. There probably was a reason we did it this way in the past. It's true that mixing metadata is wrong; however, we allowed it, and it also wasn't against ecosystem practices, was it? (Although it is very unexpected, without a doubt.) So this can't be compared to our recent dropping of the Go vendoring flags, because those actually allowed projects to use incorrect repo setups which would not be buildable with standard toolkits the way users intended in the first place. I'm not sure that's the case here. If we end up wanting this, then you'll have to accompany this change with a docs update (we'll also need to mention it in the release notes). That said, although I'm definitely not a fan of breaking backwards compatibility, strictly speaking SemVer [1]:
and so I won't stop this work based on this argument, but we'll probably need more voices in favour. [1] https://semver.org/#semantic-versioning-specification-semver
You still stuffed everything into commit 1. The changes can be introduced gradually by adding one unit test at a time and turning off that particular test area in the more complex unit test you're trying to kill. That way, you'd keep most things as they are until you're ready to switch, and then remove everything you don't need in a single commit. It can be done, and the diff will be much more readable IMO. I'm not fond of justifying squashed changes by pointing to a complex unit test that can't easily be replaced (I mentioned one way to do it above); things can be made cleaner for the reader/reviewer.
I'll try my best
Also, it might be worth discussing setup.py as:
We can at least add a warning when extracting metadata from setup.py
People should be aware of
@brunoapimentel @a-ovchinnikov @taylormadore @ben-alkov any opinions on simpler yet backwards incompatible behaviour?
Original behaviour looks somewhat strange to me: I would think that a package which has its name defined in one config and its version in another is malformed and will cause other issues as well. While possible, I don't believe it is probable to find such a package. I am generally in favor of making a live test. The change breaks the original behavior, but I am not sure it was correct to begin with. With the code as we had it before, we stopped at the first found pair of name and version, but technically every location could have defined its own name and version, so a sequence of
would have resulted in (foo, 1.0.0), and there is no good way of telling whether that is the correct (name, version) rather than (baz, 2.3.4). Personally, I would have rejected a package that has mismatching names or versions in its definition, or at least emitted a big warning. This change makes the code a little cleaner, so I am in favor of it.
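To make the ambiguity described above concrete, here is a minimal sketch of "mixed" resolution. It is not the project's actual code, and the file contents are hypothetical values chosen for illustration; they are not a reconstruction of the sequence referred to in the comment. The function name `resolve_mixed` and the `CONFIGS` list are likewise assumptions.

```python
from typing import Optional

# Hypothetical parsed metadata, listed in the order the files are checked.
# Illustrative values only; not taken from the PR.
CONFIGS = [
    ("pyproject.toml", {"name": "foo", "version": None}),
    ("setup.cfg", {"name": "bar", "version": "1.0.0"}),
    ("setup.py", {"name": "baz", "version": "2.3.4"}),
]


def resolve_mixed(configs) -> tuple[Optional[str], Optional[str]]:
    """Take the first name and the first version found, even if they
    come from different files (the behaviour being questioned)."""
    name: Optional[str] = None
    version: Optional[str] = None
    for _filename, metadata in configs:
        if name is None:
            name = metadata.get("name")
        if version is None:
            version = metadata.get("version")
        if name is not None and version is not None:
            break  # stop at the first complete (name, version) pair
    return name, version


print(resolve_mixed(CONFIGS))  # ('foo', '1.0.0')
```

With this input the name comes from one file and the version from another, so the result describes a package that none of the files declares on its own.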
d64087f
to
84a0060
Compare
84a0060
to
043ac2b
Compare
Needs commit msg adjustment wrt/ this being a breaking change, otherwise ACK.
Signed-off-by: Michal Šoltis <[email protected]>
There is no context within the log warning. We don't warn users about other things when parsing package metadata (for example deprecation of setup.py). The version is an optional attribute in the SBOM. Even cachi2 uses "dynamic version". Signed-off-by: Michal Šoltis <[email protected]>
This commit follows the previous one, which drops a warning when processing metadata from pyproject.toml. This piece of code is no longer needed. Signed-off-by: Michal Šoltis <[email protected]>
Signed-off-by: Michal Šoltis <[email protected]>
…function Signed-off-by: Michal Šoltis <[email protected]>
Do not mix name and version from multiple config files (pyproject.toml, setup.cfg, setup.py), nor with the name from the git origin remote. The new behavior parses one config file at a time and tries to get the name + version from it. If the name is there, both name and version are returned regardless of whether the version is present. This metadata will be used in the SBOM for the component representing the processed package. Even though the probability of affecting users is low, this is considered a breaking change since the component PURL might now be different. Therefore, it should be mentioned in the release notes. The commit also drastically simplifies the unit tests to reduce their overall run time while preserving the same coverage. Signed-off-by: Michal Šoltis <[email protected]>
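The per-file behaviour described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `resolve_per_file` and its `fallback_name` parameter are hypothetical, and the commit message does not say what happens when no config file provides a name, so the fallback here is a guess.

```python
from typing import Optional


def resolve_per_file(
    configs, fallback_name: Optional[str] = None
) -> tuple[Optional[str], Optional[str]]:
    """Return (name, version) from the first config file that defines a name.

    The version is taken from that same file only and may be None.
    """
    for _filename, metadata in configs:
        name = metadata.get("name")
        if name:
            # Name found: return this file's version as-is, even if missing.
            return name, metadata.get("version")
    # Assumption: no file defined a name, so use a caller-supplied fallback.
    return fallback_name, None


# Hypothetical parsed metadata, in the order the files are checked.
configs = [
    ("pyproject.toml", {"name": "foo"}),
    ("setup.cfg", {"name": "bar", "version": "1.0.0"}),
]
print(resolve_per_file(configs))  # ('foo', None)
```

Here the version from setup.cfg is ignored because pyproject.toml already supplied the name, which is the kind of case where the resulting component PURL can differ from what the earlier mixing logic would have produced.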
043ac2b
to
1240a0c
Compare
5a65f16
My local approximate results:
(venv) ~/cachi2 (main) $ time tox -e py312
(venv) ~/cachi2 (pip-refactoring) $ time tox -e py312
Maintainers will complete the following section
Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:
/ok-to-test
(as is the standard for Pipelines as Code)